Combine multiple HTML files into one file
To remove the material between tags:
1 Open the file in UltraEdit; remember to convert it to Unicode if it contains Chinese text.
2 Remove the line breaks (^p) so that find and replace can match across multiple lines.
If the line breaks are significant, replace ^p with a marker string instead and restore the line breaks afterwards.
3 Use a regular expression to remove the material between the tags.
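Steps 2 and 3 can also be sketched in R, as a rough illustration rather than a finished tool. The function name remove_between_tags and the marker string "@@NL@@" are made up for this example, and the sketch assumes the marker never occurs in the file itself.

```r
# Hypothetical helper (not from the original note): remove everything from
# <tag ...> to </tag>, even when the span crosses line breaks.
remove_between_tags <- function(lines, tag, marker = "@@NL@@") {
  # Step 2: join the lines with a marker string so one regex can span them.
  # Assumption: the marker does not appear anywhere in the input.
  joined <- paste(lines, collapse = marker)
  # Step 3: non-greedy match from the opening tag to the nearest closing tag.
  pattern <- paste0("<", tag, "\\b[^>]*>.*?</", tag, ">")
  cleaned <- gsub(pattern, "", joined, perl = TRUE, ignore.case = TRUE)
  # Recover the line breaks by splitting on the marker again.
  strsplit(cleaned, marker, fixed = TRUE)[[1]]
}

# Usage: drop a script block that spans three lines.
lines <- c("<p>keep</p>", "<script>", "var x = 1;", "</script>", "<p>also keep</p>")
result <- remove_between_tags(lines, "script")
print(result)
```

The match is non-greedy so that two separate blocks with the same tag are removed one at a time instead of swallowing the text between them. Note that strsplit drops a trailing empty string, so a blank last line may be lost.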
Note the following R code, written as a trial:
================
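# readLines' own encoding argument only marks strings as Latin-1 or UTF-8,
# so declare Big5 on the connection to have the text converted on read
con=file("all.htm",encoding = "big5")
txt=readLines(con)
close(con)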
sink("newall.htm")
cat(gsub("